We propose a model for growing networks based on a finite memory of the nodes. The model
shows stylized features of real-world networks: power law distribution of degree, linear preferential 
attachment of new links and a negative correlation between the age of a node and its link attachment 
rate. Notably, the degree distribution is conserved even though only the most recently grown part of 
the network is considered. This feature is relevant because real-world networks truncated in the same 
way exhibit a power-law distribution in the degree. As the network grows, the clustering reaches 
an asymptotic value larger than for regular lattices of the same average connectivity. These high- 
clustering scale-free networks indicate that memory effects could be crucial for a correct description 
of the dynamics of growing networks. 



Many systems can be represented by networks, i.e. as 
a set of nodes joined together by links. Social networks, 
the Internet, food webs, distribution networks, metabolic 
and protein networks, the networks of airline routes, sci- 
entific collaboration networks and citation networks are 
just some examples of such systems [?]. Recently it
has been observed that a variety of networks exhibit topological
properties that deviate from those predicted by
random graphs [?]. For instance, real networks display
clustering higher than expected for random networks.
Also, it has been found that many large networks are
scale-free. Their degree distribution decays as a power
law that cannot be accounted for by the Poisson distribution
of random graphs [12,13]. The type of the degree
distribution is of great importance for the functionality
of the network [14-16]. Besides the degree distribution,
other features of the growth dynamics of real-world networks
are currently under investigation. For citation networks,
the Internet, and collaboration networks of scientists
and actors, it has been shown [17,18] that the
probability for a node to obtain a new link is an increasing
function of the number of links the node already has.
This feature of the dynamics is called preferential attachment.
Furthermore, the aging of nodes is of particular
interest [?]. In the network of scientific collaborations,
every node stops receiving links a finite time after it has 
been added to the network, since scientists have a finite 
time span of being active. Similarly, in citation networks, 
papers cease to receive links (citations), because their 
contents are outdated or summarized in review articles, 
which are then cited instead. Whether a paper is still 
cited or not, depends on a collective memory containing 
the popularity of the paper. 

In the current paper we address the study of growing 
complex networks from the perspective of the memory of 
the nodes. First, we present empirical evidence for the 
age dependence of the growth dynamics of the network of 
scientific citations. We find that old nodes are less likely 
to obtain links than nodes added to the network more 
recently. Second, motivated by this finding, we intro- 
duce a model of network self-organization that accounts 
for the three empirical features mentioned before: (1)
power-law distribution of the degree, (2) preferential attachment,
and (3) negative correlation between age and
attachment rate. The clustering of the generated net- 
works is higher than in corresponding regular lattices, 
justifying the name highly clustered scale-free networks. 



PREVIOUS MODELS 

The earliest and most basic model generating scale-free
networks was introduced by Barabási and Albert [?];
henceforth we refer to it as the BA-model. This
model explicitly incorporates preferential attachment
in its dynamical rules. At each time step a new node is
added to the network and new links are attached from
this new node to old nodes. The probability that a node
obtains an additional link is proportional to its current
degree. The model can be interpreted as an application of Simon's
growth model in the context of networks [?],
readily explaining the emergent scaling in the degree distribution.
The BA-model has since been modified in several ways
that preserve the scale-free behavior of the connectivity
distribution [?]. For the sake of clarity, in the remainder
of the paper we refer to the BA-model as a
well-established model of growing scale-free networks.
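
The growth rule described above can be illustrated with a short Python sketch. It is not taken from the original work: the function name `barabasi_albert`, the seed graph of m + 1 fully connected nodes, and the redrawing of duplicate targets are assumptions made only for illustration.

```python
import random


def barabasi_albert(n_steps, m, seed=0):
    """Grow an undirected network by degree-proportional attachment.

    Returns the list of links as pairs (new_node, old_node).
    """
    rng = random.Random(seed)
    # Assumed seed graph: m + 1 fully connected nodes, so that every node
    # starts with nonzero degree and m distinct targets always exist.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in `stubs` once per link end. Drawing uniformly
    # from this list picks a node with probability proportional to degree.
    stubs = [v for edge in edges for v in edge]
    n_nodes = m + 1
    for _ in range(n_steps):
        targets = set()
        while len(targets) < m:  # assumption: duplicate targets are redrawn
            targets.add(rng.choice(stubs))
        for old in sorted(targets):
            edges.append((n_nodes, old))
            stubs.extend((n_nodes, old))
        n_nodes += 1
    return edges
```

Under these assumptions every time step adds exactly m links, so `barabasi_albert(1000, 3)` returns 6 seed links plus 3000 grown links.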

Real-world networks have properties that cannot be 
accounted for by the BA-model. We find a discrepancy 
with respect to empirical data in the correlation between 
a node's age and its rate of acquiring links. For the net- 
work of scientific citations this correlation is negative: the 
mean rate of citations a paper receives decreases with in- 
creasing age. This is supported by citation rate data of 
the years 1987-1998, shown in Figure 1. Except for the
first three years after publication, the citation
rate decreases with age [?]. In contradiction to this
empirical result, in the BA-model the mean attachment 
rate is positively correlated with age. Here the attach- 
ment rate is proportional to the degree, being largest for 
the oldest nodes since these began accumulating links 
earliest. A further consequence of this feature is a strong
positive correlation between the age of a node and its degree.
This kind of correlation has not been found in the
network formed by the hyperlinks of the World Wide Web
[?]. We also notice that if the oldest nodes are disregarded,
the networks generated by the BA-model are no longer
scale-free. However, real-world networks have
been shown to be scale-free even when they are truncated,
i.e. when the major part of the oldest nodes is disregarded.

FIG. 1. Data on the network formed by scientific publications
(nodes) and citations (directed links). Upper panel,
circles: The number of papers published in a given year from
1987 to 1998. Triangles: The total number of citations made
in papers published in 1998 and referring to papers published
in a given year [25]. The data for both curves have been extracted
from the ISI database [29]. Lower panel: The average
number of citations (incoming links) a paper received in 1998 
as a function of the paper's publication year. The values are 
obtained as the ratio between the values of the two curves in 
the upper panel. Considering only papers more than 3 years 
old (published before 1995) the rate of obtaining new citations 
decreases with age. This indicates that aging is an important 
feature of citation networks. 



GROWTH AND DEACTIVATION MODEL 

The shortcomings indicated in the previous paragraph 
motivate our attempt to model self-organization of scale- 
free networks. The approach presented here is based 
on the degree-dependent deactivation dynamics of the 
nodes. Preferential attachment and the convergence to a
power-law degree distribution are shown to be emergent
properties of the dynamics. 

The model describes the growth dynamics of a network
with directed links. By $k_i$ we denote the in-degree
of node i, i.e. the number of links pointing to node i.
Each node of the network can be in two different states: 
active or inactive. A new node added to the network is 
always in the active state first. It receives links from sub- 
sequently generated nodes until it is deactivated. Then 
the node does not receive links any more. The transi- 
tion of a node from the active to the inactive state can 
be interpreted as a collective "forgetting" of the node 
since new nodes do not connect to it any more. For the 
construction of the model we assume that the probability 
rate P of deactivation decreases with the in-degree of the 
node. Considering for instance the case of citation net- 
works, this means that the more often a paper has been 
cited, the less likely it is forgotten. Specifically, we make 
the assumption that the deactivation probability can be
written as $P \propto (k + a)^{-1}$, where $a > 0$ is a constant bias.

At any step of the time-discrete dynamics m nodes in 
the network are active, all the other nodes are inactive. 
As the initial condition we use a network consisting of m 
active, completely connected nodes. Then the dynamics 
runs as follows: 

1. Add a new node i to the network. The new node
is disconnected at first, so $k_i = 0$ at this point.

2. Attach m outgoing links to the new node i. Each
node j of the m active nodes receives exactly one
incoming link, thereby $k_j \to k_j + 1$.

3. Activate the new node i.

4. Deactivate one of the active nodes. The probability
that the node j is deactivated is given by

$$P(k_j) = \frac{\gamma - 1}{a + k_j} \qquad (1)$$

where $a > 0$ is a constant bias and the normalization
factor is defined as $\gamma - 1 = \left[ \sum_{i \in \mathcal{A}} \frac{1}{a + k_i} \right]^{-1}$.
The summation runs over the set $\mathcal{A}$ of the currently
active nodes.

5. Resume at 1. 
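
The five steps above can be sketched in Python as follows. This is a minimal illustration rather than the code used for the results reported here: the name `grow_and_deactivate`, the orientation of the links in the initial complete graph, and the inclusion of the newly activated node among the candidates for deactivation are assumptions.

```python
import random


def grow_and_deactivate(n_steps, m, a, seed=0):
    """Run steps 1-5 of the growth and deactivation rule n_steps times.

    Returns (in_degree, edges, active): the in-degree k_i of every node,
    the directed links (source, target) and the final list of active nodes.
    """
    rng = random.Random(seed)
    # Initial condition: m active, completely connected nodes.
    # Assumption: each initial link points from the higher to the lower label.
    in_degree = [0] * m
    edges = []
    for i in range(m):
        for j in range(i):
            edges.append((i, j))
            in_degree[j] += 1
    active = list(range(m))
    for _ in range(n_steps):
        new = len(in_degree)
        in_degree.append(0)              # step 1: new node with k_i = 0
        for j in active:                 # step 2: one link to each active node
            edges.append((new, j))
            in_degree[j] += 1
        active.append(new)               # step 3: activate the new node
        # Step 4: deactivate one node with probability proportional to
        # 1 / (a + k_j). random.choices normalizes the weights, which plays
        # the role of the factor gamma - 1 in Eq. (1).
        weights = [1.0 / (a + in_degree[j]) for j in active]
        idx = rng.choices(range(len(active)), weights=weights, k=1)[0]
        active.pop(idx)
    return in_degree, edges, active
```

The returned `in_degree` list can be histogrammed to compare a simulated network with the degree distribution derived in the next section.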

The average connectivity of the network is given by the 
number of outgoing links per node, m. It is worth noting 
that a node receives incoming links during the lifetime
$\tau$ in which it is active, and once inactive it will not receive links
any longer. Thus for each node i the time $\tau_i$ spent in the
active state and the in-degree $k_i$ are equivalent.

The deactivation mechanism strongly simplifies the dy- 
namics of growing complex networks. Neither gradual ag- 
ing nor possible reactivation are taken into account. For 
instance, in the context of citation networks, the model 
does not consider the rediscovery of "forgotten" papers.
Moreover, the functional form of the deactivation probability
might well differ from Eq. (1). However, we will
show that the model reproduces several features of real 
growing networks. 



DEGREE DISTRIBUTION 

The distribution $N(k)$ of the in-degree $k$ can be obtained
analytically for the model defined above, considering
the continuous limit of $k$. Let us first derive the
distribution $p^{(t)}(k)$ of the in-degree of the active nodes
at time $t$. For $k > 0$, the time evolution is determined by
the following master equation



$$p^{(t+1)}(k+1) = \left(1 - P(k)\right) p^{(t)}(k) = \left[ 1 - \frac{\gamma - 1}{a + k} \right] p^{(t)}(k) \qquad (2)$$



where $a$ and $\gamma$ are defined in step 4 of the model definition.
The boundary value $p(0)$ is a constant reflecting
the constant rate of new nodes with initial $k = 0$.

Assuming that the fluctuations of the normalization
$\gamma - 1$ are small enough that $\gamma$ may be treated
as a constant, the stationary case $p^{(t+1)}(k) = p^{(t)}(k)$ of
Eq. (2) yields



$$p(k+1) - p(k) = -\frac{\gamma - 1}{a + k}\, p(k) \qquad (3)$$

Treating $k$ as continuous we write

$$\frac{dp}{dk} = -\frac{\gamma - 1}{a + k}\, p(k) \qquad (4)$$

and obtain the solution

$$p(k) = b\,(a + k)^{-\gamma + 1} \qquad (5)$$



with appropriate normalization constant b. In case the 
total number n of nodes in the network is large compared 
with the number m of active nodes, the overall degree 
distribution N(k) can be approximated by considering 
the inactive nodes only. Thus N(k) can be calculated as 
the rate of change of the degree distribution $p(k)$ of the
active nodes. We find 



$$N(k) = -\frac{dp}{dk} = c\,(a + k)^{-\gamma} \qquad (6)$$



with $c = (\gamma - 1)\,a^{\gamma - 1}$. The exponent $\gamma$ follows from
a self-consistency condition for the average
connectivity



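$$m = \int_0^\infty k\,N(k)\,dk = (\gamma - 1)\,a^{\gamma - 1} \int_0^\infty \frac{k}{(a + k)^{\gamma}}\,dk \qquad (7)$$

which gives

$$\gamma = 2 + \frac{a}{m} \qquad (8)$$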



Thus the exponent $\gamma$ depends only on the ratio $a/m$.
Similar expressions have been obtained for a version of
the BA-model with directed links [?]. Although the
growth and deactivation model has been formulated for
directed networks, it can easily be applied to generate
undirected networks as well.
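
As a quick numerical check of the derivation, the following Python sketch iterates the discrete stationary recursion obtained from Eq. (2) and compares its large-k slope with the continuum exponent $-(\gamma - 1)$ of Eq. (5), using $\gamma$ from Eq. (8). The function names and the choice of the two degrees at which the slope is measured are assumptions made for illustration.

```python
import math


def exponent(a, m):
    """Eq. (8): gamma = 2 + a/m."""
    return 2.0 + a / m


def stationary_active_distribution(a, gamma, k_max):
    """Iterate p(k+1) = [1 - (gamma - 1)/(a + k)] p(k), starting from p(0) = 1.

    The result is left unnormalized. The recursion only makes sense if the
    deactivation probability stays below 1, i.e. if a > gamma - 1.
    """
    if a <= gamma - 1.0:
        raise ValueError("need a > gamma - 1 so that P(0) < 1")
    p = [1.0]
    for k in range(k_max):
        p.append(p[-1] * (1.0 - (gamma - 1.0) / (a + k)))
    return p


def log_slope(p, a, k1, k2):
    """Slope of log p against log(a + k) between the degrees k1 and k2."""
    return math.log(p[k2] / p[k1]) / math.log((a + k2) / (a + k1))


gamma = exponent(a=10, m=10)  # a = m gives gamma = 3
p = stationary_active_distribution(a=10, gamma=gamma, k_max=10_000)
slope = log_slope(p, a=10, k1=1_000, k2=10_000)
print(f"gamma = {gamma}, measured slope = {slope:.4f}")
```

For $a = m = 10$ the measured slope should lie close to $-(\gamma - 1) = -2$, which indicates that the continuous approximation is accurate at large $k$.
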

FIG. 2. Comparison of the degree distribution obtained for
the undirected networks following the BA (dashed line) and
the growth and deactivation model (solid line). In (a) the
complete networks are considered after $5 \times 10^4$ time steps. In
contrast, in (b) only the network formed by the newest nodes
and their links is taken into account. In (c) we plot the maximum
degree, $k_{\max}$, observed in the truncated network against
the truncation ratio A. In the BA model, $k_{\max}$ scales as a
power law with A. In contrast, the new model shows
a power-law degree distribution whose
cutoff is only slightly affected by the finite size of the truncated
network. All curves are averages over 100 independent
simulation runs.



Numerical results 

Figure 2(a) shows the cumulative distribution of the total
degree $k' = m + k$ obtained by simulating the model
for $5 \times 10^4$ time steps. We obtain a power law scaling for
several decades, in agreement with the analytical result
in Eq. (6). The exponent found numerically is 1.9, slightly
below the analytical result $\gamma - 1 = 2 + a/m - 1 = 2$ for the
case $a = m$. The deviation can be explained by the continuous
limit used in the theoretical derivation of $\gamma$ and
the assumption that $\gamma$ is a constant. Conducting further
simulations for various values of $m$ and $a$, we find that
the fluctuations of $\gamma$ become smaller when increasing $m$
and/or $a$. Then the discrepancy between analytical and
numerical results decreases. Figure 2(a) also shows corresponding
simulation results for the BA model, using
$m = 10$ and $5 \times 10^4$ time steps as well. In the range
$k' < 1000$ we obtain almost the same distribution as for
the growth and deactivation model. However, the main
difference between both models is the presence of a cutoff
at a lower value for the BA-model.

Up to this point we have considered degree distribu- 
tions including all nodes of the network. However, in 
many cases empirical data contain only those nodes and 
links of the network that have been created most recently. 
For instance, studies on scientific citation networks [9]
are restricted to papers that are not older than 20 years, 
thereby ignoring the major part of the initial network. 
A pronounced power law regime is observed in the de- 
gree distribution of these truncated networks. Therefore 
it is important to investigate the robustness of the scale- 
free networks obtained from models under truncation in 
time. Figure 2(b) shows the cumulative degree distributions
analogous to Fig. 2(a), but now regarding the
truncated network where the fraction A = 50% of oldest 
nodes and all their links are disregarded. Concerning the 
BA-model the effect of truncation is drastic. The trun- 
cated network does not exhibit a scale-free range in the 
degree distribution. This is different for the growth and 
deactivation model. The influence of the truncation on 
the degree distribution is a slight shift of the cutoff for 
high $k'$. In order to examine the effect of truncation
systematically, we consider the largest degree $k'_{\max}$ occurring in
the truncated network, as a function of the fraction A of
disregarded nodes. According to Fig. 2(c), $k'_{\max}$ decays
as a power law (with an approximate exponent of 0.5,
$k'_{\max} \sim A^{-0.5}$) for the BA-model. On the other hand,
the new model introduced here exhibits only a weak de- 
pendence of the maximum degree on the truncation. 
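
The truncation procedure can be sketched in Python as follows. The helper names `truncated_degrees` and `cumulative_distribution`, and the convention that nodes are labeled in order of creation, are assumptions made for illustration; the link list may come from any growth model.

```python
def truncated_degrees(edges, n_nodes, fraction):
    """Total degree of every surviving node after disregarding the oldest nodes.

    Nodes are assumed to be labeled 0 .. n_nodes - 1 in order of creation.
    The oldest `fraction` of the nodes and all their links are dropped.
    """
    cutoff = int(fraction * n_nodes)  # labels below cutoff are disregarded
    degree = {v: 0 for v in range(cutoff, n_nodes)}
    for u, v in edges:
        if u >= cutoff and v >= cutoff:  # keep links between survivors only
            degree[u] += 1
            degree[v] += 1
    return degree


def cumulative_distribution(degrees):
    """Return pairs (k, fraction of nodes with degree >= k), sorted by k."""
    values = sorted(degrees)
    n = len(values)
    result = []
    for idx, k in enumerate(values):
        if idx == 0 or k != values[idx - 1]:
            result.append((k, (n - idx) / n))
    return result


# Toy example with four nodes; node 0 is the oldest.
toy_edges = [(1, 0), (2, 0), (2, 1), (3, 1), (3, 2)]
full = truncated_degrees(toy_edges, 4, 0.0)
half = truncated_degrees(toy_edges, 4, 0.5)
print(max(full.values()), max(half.values()))
```

Evaluating `max(truncated_degrees(...).values())` for a range of truncation fractions yields the kind of curve plotted in Fig. 2(c).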



LINEAR PREFERENTIAL ATTACHMENT 

Another relevant dynamical property is the degree-dependent
attachment rate $\Pi(k)$. It is measured as follows:
Consider the set $\mathcal{K}$ of nodes with degree $k$ at
a certain time $t$. Measure the average degree $k + \Delta k$
of the nodes in $\mathcal{K}$ at a later time $t + \Delta t$. Then let
$\Pi(k) = \Delta k / \Delta t$. In recent studies of various growing
networks, it has been found empirically that $\Pi(k)$ is an
increasing function [17,18,27]. This phenomenon is called
preferential attachment. For the Internet and citation
networks the preferential attachment is linear, $\Pi(k) \propto k$.

We can calculate $\Pi(k)$ for the model introduced in
the present paper. At a time $t$, the network contains
$t$ nodes, $t\,N(k)$ of which have degree $k$. The number of
active nodes with degree $k$ is $m\,p(k)$. A time step later,
$\Delta t = 1$, each of the active nodes has increased its degree
by 1, whereas the degree of the inactive nodes remains
unchanged. Thus, according to Eqs. (5) and (6), the average
increase of the degree is

$$\Pi(k) = \frac{m\,p(k)}{t\,N(k)} \propto (a + k) \qquad (9)$$



The model shows linear preferential attachment as an 
emergent property of the degree-dependent deactivation 
dynamics. 
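
The measurement of the attachment rate described at the beginning of this section can be sketched in Python as follows. The function name `attachment_rate` and the representation of the two degree snapshots as dictionaries are assumptions made for illustration.

```python
from collections import defaultdict


def attachment_rate(degree_t, degree_later, dt):
    """Estimate the attachment rate for every degree k present at time t.

    degree_t and degree_later map node -> degree at times t and t + dt.
    Nodes created after time t appear only in degree_later and are ignored.
    """
    gain = defaultdict(float)
    count = defaultdict(int)
    for node, k in degree_t.items():
        gain[k] += degree_later[node] - k  # degree increase of this node
        count[k] += 1
    # Average increase per node of degree k, divided by the elapsed time.
    return {k: gain[k] / (count[k] * dt) for k in count}


# Toy snapshots: two nodes of degree 2 and one node of degree 5 at time t.
before = {0: 2, 1: 2, 2: 5}
after = {0: 2, 1: 4, 2: 8, 3: 1}
print(attachment_rate(before, after, dt=2))
```

In this toy example the nodes of degree 2 gain one link per two time steps on average and the node of degree 5 gains three, so the estimated rates are 0.5 and 1.5.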



AGE DISTRIBUTION 

Let us now consider the distribution of the age $\tau$ of
nodes receiving a new link. We define the time-dependent
age distribution $h(\tau, t)$ as the probability that a new link
created at time $t$ attaches to a node of age $\tau$, i.e. to a node
created at time $t - \tau$. For the model defined here, the
age distribution $h$ is easy to obtain. Only active nodes
receive links, and for these nodes their age $\tau$ and their in-degree
$k$ have the same value. Therefore the probability
that the node of age $\tau$ obtains a new link is the same as
the probability for a node with $\tau$ links to be active, given
by Eq. (5). It is independent of $t$:



$$h(\tau) \propto (a + \tau)^{-\gamma + 1} \qquad (10)$$



For comparison we calculate the age distribution for the
BA-model. Apart from small deviations, the total degree
of the node i created at time $t_i$ is [?]:



$$k'_i(t) = m \left( \frac{t}{t_i} \right)^{0.5} = m \left( \frac{t}{t - \tau} \right)^{0.5} \qquad (11)$$



where the second equality is due to the substitution
$t_i = t - \tau$. The probability of obtaining a new link is
proportional to the total degree, thus we find



$$h(\tau, t) \propto \left( \frac{t}{t - \tau} \right)^{0.5} \qquad (12)$$



In the BA-model the probability of receiving a new link
increases with the age of the node. In sharp contrast,
the growth and deactivation model displays a forgetting
of old nodes where the rate of forgetting is a power law,
Eq. (10). Figure 3 shows plots of the age distributions
for both models, to be compared with the empirical data
in Fig. 1. The age distribution of the growth and deactivation
model decays with $\tau$. This agrees with the
empirical data on citation networks except for the first 3
years after publication.
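
The opposite trends of the two age distributions can be made concrete with a short Python sketch that evaluates the analytical forms of Eqs. (10) and (12) on integer ages and normalizes them. The function names and the restriction of the age to $\tau = 0, \dots, t - 1$ are assumptions made for illustration.

```python
def normalize(weights):
    """Scale a list of non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def age_distribution_deactivation(t, a, gamma):
    """Eq. (10): h(tau) proportional to (a + tau)^(-gamma + 1)."""
    return normalize([(a + tau) ** (-(gamma - 1.0)) for tau in range(t)])


def age_distribution_ba(t):
    """Eq. (12): h(tau, t) proportional to (t / (t - tau))^0.5.

    The age runs over tau = 0 .. t - 1, so that t - tau is never zero.
    """
    return normalize([(t / (t - tau)) ** 0.5 for tau in range(t)])


h_deact = age_distribution_deactivation(t=1000, a=10, gamma=3.0)
h_ba = age_distribution_ba(t=1000)
# Youngest versus oldest node in each model.
print(h_deact[0] > h_deact[-1], h_ba[0] < h_ba[-1])
```

In the growth and deactivation model the weight is largest for the youngest nodes, whereas in the BA-model it is largest for the oldest ones.
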

FIG. 3. Age distribution $h(\tau, t)$ of nodes receiving links.
In the growth and deactivation model the distribution (solid
line) follows a power law decay with the age of the node. In
contrast, in the BA-model (dashed line) it is the oldest nodes
that are most likely to receive new links. For each of the two
models the plotted data have been generated as an average
over 100 independent simulation runs lasting $5 \times 10^4$ time
steps.



CLUSTERING COEFFICIENT 

The clustering coefficient $C$ [?] is one of the parameters
used to characterize the topology of complex networks. It
is a local property measuring the probability with which
two neighbors of a node are also neighbors of each other
(nodes i and j are neighbors if there is a link between i
and j). It has been found that many real-world networks
have a clustering coefficient much larger than the corresponding
random graph, for which it scales with the system
size $N$ as $C_{\mathrm{rand}} \sim \langle k \rangle / N$.

Fig. 4(a) shows that for the growth and deactivation
model the clustering coefficient tends towards an asymptotic
value ($\approx 0.83$) as the network grows. The analytical
derivation of $C$ is facilitated by the observation that the
clustering $C_i$ of a node i depends only on the node's
in-degree $k_i$. A detailed calculation gives an asymptotic
value $C = 5/6$ for the case $a = m$ considered here.
Thus the model generates networks with a higher clustering
than the corresponding one-dimensional regular
lattices, for which $C < 3/4$. The large value of the clustering coefficient
and the fact that it does not decrease with network
size are in qualitative agreement with recent data on
the Internet [?]. For the sake of comparison, in Fig. 4(b)
the clustering coefficient of the BA-model is plotted for
several network sizes $N$. Here the clustering clearly decays
with increasing $N$. The quantitative behavior of the
decay can be described by $C \sim (\ln N)^2 / N$. The detailed
derivation of the clustering coefficients for both models
is included in (Klemm and Eguiluz, unpublished work).
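
The regular-lattice reference values can be reproduced with a short Python sketch. It computes the clustering coefficient as the average over all nodes of the fraction of existing links among each node's neighbors, and applies it to a one-dimensional ring in which every node is linked to its m nearest neighbors on each side. The function names, the ring construction, and the convention that nodes with fewer than two neighbors contribute zero are assumptions made for illustration.

```python
from itertools import combinations


def clustering_coefficient(neighbors):
    """Average over all nodes of the fraction of linked neighbor pairs.

    neighbors maps each node to the set of nodes it is linked with
    (links are treated as bidirectional).
    """
    total = 0.0
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue  # assumption: such nodes contribute C_i = 0
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        total += linked / (k * (k - 1) / 2)
    return total / len(neighbors)


def ring_lattice(n, m):
    """Ring of n nodes, each linked to its m nearest neighbors on each side."""
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, m + 1):
            j = (i + d) % n
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


c2 = clustering_coefficient(ring_lattice(100, 2))
c10 = clustering_coefficient(ring_lattice(100, 10))
print(round(c2, 2), round(c10, 2))
```

For such rings the clustering coefficient is 0.5 for $m = 2$ and approximately 0.71 for $m = 10$, both below the asymptotic value of the growth and deactivation model.
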

FIG. 4. Dependence of the clustering coefficient $C$ on the
size $N$ of the network. (a) Growth and deactivation model
for $m = a = 2$ (unfilled) and $m = a = 10$ (filled symbols).
$C$ approaches a high stationary value above 0.8. Note that
corresponding one-dimensional regular lattices have $C = 0.5$
($m = 2$) and $C = 0.71$ ($m = 10$), respectively. (b) BA-model
for $m = 2$ (unfilled) and $m = 10$ (filled symbols). The clustering
coefficient strongly decreases as the network grows. (c)
The same data as in (b), but plotting $(NC)^{0.5}$ as a function of
$N$. This function is a straight line in a log-linear plot, indicating
that $C$ scales as $(\ln N)^2 / N$ for large $N$. Each data point
is an average over 100 independent simulation runs. The clustering
coefficient $C$ is defined as follows: Consider a node i
with total degree $k'_i$. Between the $k'_i$ nodes that i is linked
with, at most $k'_i (k'_i - 1)/2$ links are possible. Let $C_i$ denote the
fraction of these links that actually exist among the neighbors of i.
The clustering coefficient $C$ is the average of $C_i$ taken over all
$N$ nodes i in the network. Note that all links are considered
as bidirectional when calculating the clustering coefficient.



CONCLUSIONS 

The analysis of citation networks suggests a negative 
correlation between the age of a node and its probability 
to obtain further links. Older nodes are less likely to in- 
crease their connectivity than those added to the network 
more recently. Motivated by this finding, we have proposed
and analyzed a new approach based on nodes with
one degree of freedom, a memory, indicating the ability
of the node to attract further links. We have found
that with the simple setting of the model the degree dis- 
tribution converges to a power law, where the exponent 
can be obtained analytically. As emergent properties of 
the model, (1) preferential attachment is obtained, a fea- 
ture observed recently in various real growing networks, 
and (2) the correlation between age and linking proba- 
bility is negative, in agreement also with the empirical 
results mentioned above. Unlike previous models, degree 
and age of nodes are uncorrelated in the model intro- 
duced here. Therefore the networks retain the power- 
law distribution of the degree even though only the most 
recent nodes are considered. This agrees with the fact 
that also truncated real-world networks are observed to 
be scale-free. Finally it is worth noting the resemblance 
of the grown networks to regular lattices. The highly 
clustered scale-free networks make a connection between 
scale-free networks and regular lattices. They define a 
new class of scale-free networks. Interesting extensions 
of the model include the introduction of random links, 
similarly to models of small-world networks. We expect
to find a connection between scale-free growing networks 
and the small-world transition from regular lattices. Re- 
search along this line is in progress. 